Introduction

Here we use random forests to quantify how well different variables explain the growth and mortality of tree species. We use the forest inventory database for the eastern North American forest that contains… #TODO

For each tree species in the database (32 in total), we run a random forest for growth and for mortality with different parameters and explanatory variables. We then evaluate each simulation based on the \(R^2\) for growth and … #TODO We further check which variables have the greatest explanatory power for the response variable.
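As a reference for the growth metric, here is a minimal sketch of \(R^2\) under its usual definition, one minus the ratio of the residual to the total sum of squares. The function name `r_squared` and the toy vectors are hypothetical. Whether the value reported for each forest is computed out-of-bag or on held-out data is not specified here, so this only illustrates the formula.

```r
# Coefficient of determination: 1 - SSE / SST.
# `observed` and `predicted` are hypothetical names for numeric vectors
# of equal length (e.g. observed growth and a model's predictions).
r_squared <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted))
  sse <- sum((observed - predicted)^2)        # residual sum of squares
  sst <- sum((observed - mean(observed))^2)   # total sum of squares
  1 - sse / sst
}

# Toy values for illustration only (not taken from the inventory data)
obs  <- c(1.0, 2.0, 3.0, 4.0)
pred <- c(1.1, 1.9, 3.2, 3.8)
r_squared(obs, pred)
```

A value of 1 means the predictions match the observations exactly, and a value of 0 means they do no better than always predicting the mean of the observations.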

Simulations

For each tree species, we ran a random forest varying (i) the explanatory variables and (ii) the number of variables to possibly split at in each node (mtry). The three sets of variables are:

## [1] "set 1"
##  [1] "growth"                      "dbh0"                       
##  [3] "height"                      "tree_id"                    
##  [5] "plot_id"                     "s_star"                     
##  [7] "canopyStatus"                "canopyDistance"             
##  [9] "latitude"                    "longitude"                  
## [11] "mean_temp_period_3_lag"      "min_temp_coldest_period_lag"
## [13] "min_extreme_temp"            "tot_annual_pp_lag"          
## [15] "tot_pp_period3_lag"         
## 
## [1] "set 2"
##  [1] "growth"                      "dbh0"                       
##  [3] "height"                      "canopyDistance"             
##  [5] "latitude"                    "longitude"                  
##  [7] "mean_temp_period_3_lag"      "min_temp_coldest_period_lag"
##  [9] "min_extreme_temp"            "tot_annual_pp_lag"          
## [11] "tot_pp_period3_lag"         
## 
## [1] "set 3"
## [1] "growth"                 "dbh0"                   "canopyDistance"        
## [4] "latitude"               "longitude"              "mean_temp_period_3_lag"
## [7] "tot_pp_period3_lag"

For each set of variables, mtry took the values 2, 3, 4, 5, and 6, with the number of trees fixed at 1000.
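The design above can be sketched as a small configuration grid in base R. The object names (`var_sets`, `mtry_vals`, `grid`, `set3`, `f`) are hypothetical and the model-fitting call is left out on purpose. The formula at the end assumes that `growth`, listed first in each set, is the response and that the remaining names are predictors.

```r
# Candidate values taken from the text
var_sets  <- c("set 1", "set 2", "set 3")
mtry_vals <- 2:6
num_trees <- 1000

# One row per (variable set, mtry) combination
grid <- expand.grid(
  var_set = var_sets,
  mtry = mtry_vals,
  stringsAsFactors = FALSE
)
grid$num_trees <- num_trees

# 3 sets x 5 mtry values = 15 configurations per species and response
nrow(grid)

# Assumption: "growth" is the response and the other names are predictors
set3 <- c("growth", "dbh0", "canopyDistance", "latitude", "longitude",
          "mean_temp_period_3_lag", "tot_pp_period3_lag")
f <- reformulate(setdiff(set3, "growth"), response = "growth")
f
```

Under that assumption, set 3 leaves six predictors, so mtry = 6 would consider every predictor at each split for that set.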

Summary

Variables set

Assuming that the variable set var2 (set 2 above) best explains growth and mortality, we first look at the importance of the variables in var2:

Now for the other two sets of variables (var1 and var3):

Number of variables to possibly split at in each node

Individual species response